@make(article)
@begin(majorheading)
Course Fact Sheet
@end(majorheading)
@begin(majorheading)
C.S. 162 Sorting and Searching
@end(majorheading)
@begin(heading)
Winter Quarter 1983-84
@end(heading)
@begin(description)
@b(Time):@\Mon. Wed. Fri. 3:15 p.m.

@b(Place):@\ Skilling 191

@b(Instructor):@\ Robert W. Floyd (RWF@@SAIL)
@\ Margaret Jacks Hall (Bldg. 460) Room 338
@\ Phone: 497-1565
@\ Secretary: Rosemary Napier, Room 340
@\ Office Hours: After class

@b(T. A.):@\ Anthony Yeracaris
@\ YERACARIS@@SCORE, Y.YERACARIS@@LOTS-A.
@\ 960-0582 only if necessary; office hours by appointment.

@b(Text):@\ ``The Art of Computer Programming'' vol. 3, by D. E. Knuth
@\ (subtitle: ``Sorting and Searching'')


@b(Course Work):@\ There will be weekly homework assignments.
@\ Some will require computer work.  A substantial computer programming
@\ project may be assigned.

@b(Examinations):@\ There will be a midterm and a final examination.

@b(Computer System):@\ Students at Stanford should use the LOTS computer system.
@\ Off-campus students may use any available computer system, including LOTS.

@b(Grades):@\ Grades will be based roughly equally on
@\ Homework
@\ Midterm (if any)
@\ Project and
@\ Final.

@end(description)
\rm

\def\today{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\day, \number\year}

\line{\sevenrm 162prb.tex[v1,rfn] \today\hfill}
\vskip .5in

\noindent
CS162
Prof. Floyd

\noindent
Problem for CS162 (unused)

\vskip .5in

Initialize $K[i]=i$, $1≤i≤N$. Do $CN$ random exchanges of adjacent elements.
Assume $N$ is large compared to $C$. For $i$ not close to 1 or $N$, what is the
distribution of the location of the value $i$? What is the distribution of the value of
$K[i]$?

\vskip .125in

\hrule 

\vskip .125in

[RWF note to self: both use $P(K↓i=j)$, as a function of $|i-j|$]

\vskip .125in

Answer: After $CN$ swaps, each of which has probability $2/N$ of swapping $K↓i$, the
variance is $V=CN(2/N)=2C$, so the distribution is about $\exp (-x↑2/2V)=\exp
(-x↑2/4C)$.
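
A quick simulation sketch (Python; the function name and the values of $N$, $C$, and
the trial count are arbitrary choices, not part of the problem) to check the
variance estimate $V\approx 2C$:

import random
from statistics import mean, variance

def displacement_sample(N=2000, C=5, trials=300):
    # Track how far the value v = N/2 drifts after C*N random adjacent exchanges.
    v = N // 2
    out = []
    for _ in range(trials):
        K = list(range(1, N + 1))          # K[i] = i, in 1-origin terms
        for _ in range(C * N):
            j = random.randrange(N - 1)    # exchange positions j and j+1 (0-origin)
            K[j], K[j + 1] = K[j + 1], K[j]
        out.append(K.index(v) + 1 - v)     # signed displacement of the value v
    return out

d = displacement_sample()
print(mean(d), variance(d))                # the variance should come out near 2C = 10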

What is $P(K↓i<K↓j)$? What is the entropy of the set of permutations?

\vfil\end


Input data to a program are  N  random real numbers between 0 and 1. The
program must print the number of inputs which are less than  p.  What is
the entropy of this problem, as a function of  p  and  N?

Find a useful ``closed form'' (no sums, binomial coefficients, or factorials left) approximation.
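
A numerical sanity check may help here: the count of inputs below $p$ is binomially
distributed with parameters $N$ and $p$, so its exact entropy can be compared with a
candidate closed form such as ${1\over 2}\lg\bigl(2\pi e\,Np(1-p)\bigr)$.  A small
Python sketch (the chosen values of $N$ and $p$ are arbitrary):

import math

def count_entropy_bits(N, p):
    # Exact entropy (bits) of the number of inputs below p; that count is Binomial(N, p).
    H = 0.0
    for k in range(N + 1):
        lq = (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
              + k * math.log(p) + (N - k) * math.log(1 - p))
        H -= math.exp(lq) * lq / math.log(2)
    return H

def gaussian_estimate_bits(N, p):
    # One candidate closed form: 1/2 * lg(2*pi*e*N*p*(1-p)).
    return 0.5 * math.log2(2 * math.pi * math.e * N * p * (1 - p))

for N in (100, 1000, 10000):
    print(N, round(count_entropy_bits(N, 0.3), 3), round(gaussian_estimate_bits(N, 0.3), 3))
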
(A) A binary radix sort begins by partitioning the data according to whether
they are ≥ 0.5.  Assume the  N  data are real numbers, uniformly distributed
over (0,1).  What is the expected reduction in the entropy of sorting by this
first partition?

(B) As in problem ___, find a useful approximation.

(A) Adapt the simple version of Quicksort (M=1) to locating the jth largest
of  N  data (i.e., the datum which would go in location j if the data were
sorted). (No points for this, but you'll need it for part (B) ; keep it simple.)

(B) For fixed  N, assuming values of  j  are uniformly distributed from  1  to  N,
and assuming no equal data, what is the expected number of key comparisons?
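
For reference, a sketch of one way part (A) might go (Python, M=1, distinct data
assumed; the partitioning here is deliberately simple and is not meant to be the
intended solution):

import random

def select_jth_largest(data, j):
    # Quicksort-style selection: partition on a random element, recurse on one side.
    # Returns the j-th largest (j = 1 is the maximum).
    a = list(data)
    lo, hi = 0, len(a)                  # current window a[lo:hi]
    k = len(a) - j                      # 0-origin index of the answer in sorted order
    while hi - lo > 1:
        pivot = a[random.randrange(lo, hi)]
        left = [x for x in a[lo:hi] if x < pivot]
        right = [x for x in a[lo:hi] if x > pivot]
        a[lo:hi] = left + [pivot] + right
        m = lo + len(left)              # final position of the pivot
        if k == m:
            return a[m]
        elif k < m:
            hi = m
        else:
            lo = m + 1
    return a[lo]
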

Show that  O(n lg n) comparisons are needed on the average to test whether
the number of inversions in an array is odd or even.

An interpolation search is suitable for a sorted table where the data have
been drawn from a uniform distribution.  It is analogous to bisection, but
the place where it probes in the table is based on linear interpolation
between the key values at the ends of the current interval.  As the table
size, N, becomes very large, how does the average running time of interpolation
search depend on N?  It is sufficient to find an  f(N)  that is within a constant
factor of the true result.
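
A sketch of the probe rule (Python; assumes a sorted list of distinct keys and returns
an index or None; details such as duplicate keys are ignored):

def interpolation_search(table, x):
    # Probe where x "should" be, by linear interpolation between the key values
    # at the two ends of the current interval, instead of probing the midpoint.
    lo, hi = 0, len(table) - 1
    while lo <= hi and table[lo] <= x <= table[hi]:
        if table[hi] == table[lo]:
            break
        probe = lo + int((hi - lo) * (x - table[lo]) / (table[hi] - table[lo]))
        if table[probe] == x:
            return probe
        if table[probe] < x:
            lo = probe + 1
        else:
            hi = probe - 1
    return lo if lo <= hi and table[lo] == x else None
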
Suppose I sort the telephone directory entries by a recursive distribution sort,
first partitioning on the first letter, then by the second letter, etc., continuing
each subpartition until subsets of size 1 are reached.  Let  p↓i  be the frequency
in the data of the  ith  letter of the alphabet, and let  N  be the number of data.
a) Give a formula that estimates the total number of executions of the inner loop
of the algorithm as a function of  N  and the  p↓i, assuming that the letter 
frequencies are independent of context.
b) For an actual telephone directory, is the formula an overestimate or an under-
estimate of the time required?

Using 3 tapes, it is possible to sort, starting with sorted internal memory
loads ("strings") on one tape A, in two ways:
(1) Copy strings alternately on tapes B and C; rewind, and merge strings from
B and C onto A.  Repeat as needed.
(2) Copy  F↓n  strings onto B and  F↓{n-1} strings onto C, choosing  n
appropriately, where  F↓n  is the  nth  Fibonacci number.  Then merge the
strings from C with the first  F↓{n-1}  from B onto A, leaving  F↓{n-2}
unprocessed on B.  Continue analogously.  Assume tapes can be read, written,
and rewound simultaneously at their maximum speed.  Which method is faster
for large data sets?

A text contains  N  words.  The probability that each word is the  ith  word in
the d-word dictionary is  R↓i.  Assume independence of successive words.  An
algorithm is required to print out an exact count  C↓1, C↓2,..., C↓d  of how
many times each dictionary word occurs in the text.  What is the entropy of the
problem?  Use approximations; an asymptotically correct answer is satisfactory.
Hint: Try d=2.

I am trying to locate input  x (0≤x<1) in the sequence of intervals (y↓{i-1},y↓i),
(1≤i≤N), where  y↓0 = 0, y↓{i-1}<y↓i, and y↓N=1.  Assume  x  is uniformly 
distributed.  When I test  x<y↓i, the test takes 2 time units if true, 1 if false.
Which  y↓i  should I test against first, if I am trying to (approximately) minimize
testing time?

Represent an algorithm by a branching tree, where the leaves are labeled with
outcomes.  At each internal vertex  V, there is an operation which on the
average takes time  t(V), and on the average reduces the entropy by  I(V) bits.
If  I(V)/t(V)≥B  at all internal nodes and the problem has entropy A, show that
the expected running time is at most A/B.

An algorithm to sort random real numbers between 0 and 1 starts by partitioning
them into those less than 0.5, and those greater than or equal to 0.5.  What is
the average information obtained by this partition?  On the average, how much is
the entropy of sorting reduced?

(1) If  A[0..N]  contains a sorted list of random reals drawn from (0,1), what
is the distribution of values for A↓i?
(2) If  x  is in the table, what is the distribution of its location?  What is
the most likely location?
(3) What is the entropy of this distribution, approximately?
(4) If we look in position  j  and find there a value  y (>x), what is the new entropy
of the location of x?  Specifically, what if  j=xN?

Suppose the threshold value for Quicksort is chosen to be 
min(max (K↓L, K↓{L+1}), max(K↓{L+2}, K↓{L+3})), assuming M≥4 and properly
random input.  For large  N, is this better or worse than the median-of-three
method?  (Use number of comparisons as the standard.)

CS162
Prof. Floyd
			FINAL EXAM



We do not expect anyone to complete this exam; rather, we are providing you with
choices of what to work on.  The mean on the midterm was 59; it is not likely to
be higher on the final.  Read all questions before starting, noting weights.
If you get stuck on one problem, defer it.  We will check in at intervals to
interpret problems, etc.

1. Explain all answers.
2. State all relevant assumptions.
3. Give English-language descriptions of algorithms.

1. (25 points)
A heap is a particular implementation of a _priority queue_ - a data structure
into which data can be inserted in arbitrary order, and from which they can be
retrieved smallest (largest) first.
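
For reference, the standard heap that these proposals are being measured against: a
minimal array-based binary min-heap sketch (Python; both operations take O(lg n)
comparisons in the worst case):

class MinHeap:
    # Array-based binary heap: the children of slot i live in slots 2i+1 and 2i+2.
    def __init__(self):
        self.a = []

    def insert(self, key):                      # O(lg n): sift the new key up
        a = self.a
        a.append(key)
        i = len(a) - 1
        while i > 0 and a[(i - 1) // 2] > a[i]:
            a[(i - 1) // 2], a[i] = a[i], a[(i - 1) // 2]
            i = (i - 1) // 2

    def delete_min(self):                       # O(lg n): sift the moved key down
        a = self.a                              # (assumes the heap is nonempty)
        smallest, a[0] = a[0], a[-1]
        a.pop()
        i = 0
        while True:
            c = 2 * i + 1                       # left child
            if c + 1 < len(a) and a[c + 1] < a[c]:
                c += 1                          # right child is smaller
            if c >= len(a) or a[i] <= a[c]:
                return smallest
            a[i], a[c] = a[c], a[i]
            i = c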

(A) (5 points)
A programmer (Smith) in the company you own thinks he can implement a priority
queue into which a datum can be inserted in time  0.5 lg n, and from which the
smallest datum can be deleted in time  O((lg lg n)↑2).  Should you let him work
on it?

(B) (5 points)
Another programmer (Jones) says he has a priority queue implementation which,
like the heap, has expected time  O(1)  for insertions, and  O(lg(n))  for
deletions except that if  n  is within  +/-1  of a power of 10, deletions take at
most 5 comparisons.  Do you find this plausible?

(C) (5 points)
A third programmer (Johnson) says he can modify the heap algorithm so that it
takes time  O(lg n)  to insert a datum, and constant time to delete one.  Do you
believe him?

(D) (10 points)
Outline an implementation of a priority queue having one of the three types
of performance above.

2. An ordered binary tree is one in which the internal nodes contain keys which 
are increasing in the _symmetric order_ (inorder) traversal of the tree.  It is
considered _half-balanced_ if in any subtree, the distances from the root to
the leaves differ at most by a factor of two.

(A) (5 points)
Is an R-B tree with black leaves necessarily half-balanced?  Recall that the
color of a node refers to the color of the link that points to it.  

(B)(5 points)
Given a half-balanced tree, can it be colored as an R-B tree with black leaves?  

Prove your answers.

3. (10 points)
An open question about R-B trees (if you missed the lectures on them, make it
AVL trees) is whether their insertion cost approaches  lg N,  as  N→∞, on the
average.  Assume the inputs are random real numbers in the range (0,1).  If the
conjecture is true, can one expect the value at the root to approach 1/2, and
the value at its left son to approach 1/4?

4. (10 points)  Design a data structure with these operations:

(1) _Insert_ a record, including its key
(2) _Delete_ a record, given its key
(3) _Find the median_ key.  (If there are  n  records, the [n/2]th largest key is
    the median.  Assume all keys are distinct.)  All operations should require
    O(lg(n))  time; ignore constant factors.

Please describe the data structure in detail and outline how the above operations
would be performed on it.
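
One illustrative sketch (Python), under an extra assumption not in the problem: if the
universe of keys that can ever appear is fixed and known in advance, a binary indexed
(Fenwick) tree of counts over the rank-compressed keys supports insert, delete, and
find-median in time logarithmic in the universe size.  This is meant only as an example
of the flavor of solution, not as the intended answer.

from bisect import bisect_left

class MedianSet:
    # Counts kept in a binary indexed (Fenwick) tree over a fixed sorted key universe.
    def __init__(self, universe):
        self.keys = sorted(universe)
        self.tree = [0] * (len(self.keys) + 1)
        self.n = 0

    def _add(self, pos, delta):                  # pos is the 1-origin rank of the key
        while pos < len(self.tree):
            self.tree[pos] += delta
            pos += pos & (-pos)

    def insert(self, key):
        self.n += 1
        self._add(bisect_left(self.keys, key) + 1, 1)

    def delete(self, key):
        self.n -= 1
        self._add(bisect_left(self.keys, key) + 1, -1)

    def median(self):
        # k-th smallest, k = ceil(n/2), by descending the implicit tree.
        k = (self.n + 1) // 2
        pos, step = 0, 1
        while 2 * step <= len(self.keys):
            step *= 2
        while step:
            if pos + step < len(self.tree) and self.tree[pos + step] < k:
                pos += step
                k -= self.tree[pos]
            step //= 2
        return self.keys[pos]
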
5. (20 points)
At a top-secret computer facility, the passwords are 20-digit numbers, each
generated randomly with the aid of good quality decimal dice.

(A) (10 points)
We want to develop a small table-driven program for a minicomputer to check
the validity of the passwords.  As a function of N, the (fixed) number of 
passwords, how much space must be allowed for the table?

(B) (10 points)
Having done the calculation of part (A), we find that our mini is not big
enough for fully reliable password checking.  We decide to design instead a
program which, by correct choice of table contents, will accept all _valid_
passwords, but only the fraction 0.0001 of all possible passwords.  Now how much
space must be provided for the table?
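
A hedged illustration of the kind of table part (B) invites (Python; the sizing figures
are the standard Bloom-filter estimates, and all names and parameters here are made up
for the example): an $m$-bit table with $k$ hash probes per password never rejects a
valid password, and with $m\approx 1.44\,N\lg(1/0.0001)$, roughly 19 bits per password,
and $k\approx 13$, it accepts only about the fraction 0.0001 of random passwords.
(The information-theoretic minimum is $\lg(1/0.0001)\approx 13.3$ bits per password.)

import hashlib

class BloomTable:
    # Bit table of m bits, probed k times per password.
    def __init__(self, m_bits, k):
        self.m, self.k = m_bits, k
        self.bits = bytearray((m_bits + 7) // 8)

    def _probes(self, password):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{password}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, password):
        for p in self._probes(password):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_valid(self, password):
        # Never rejects a stored password; accepts a random one with small probability.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._probes(password))
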
6. (15 points)
Using 3 tapes, it is possible to sort, starting with sorted internal memory loads
("strings") on one tape A, in two ways:

(1) Starting at the right end of A, copy alternate strings onto B and C (they will
be reversed); then read B and C backwards, merging strings pairwise onto A.
Repeat until there is a single string on A.

(2) Copy  F↓n  strings onto  B  and  F↓{n-1}  onto C,  choosing  n  appropriately,
where  F↓n  is the  nth  Fibonacci number.  Then, reading backwards, merge the
strings of C with the last  F↓{n-1}  strings of B, placing the  F↓{n-1}
resulting merged strings on A.  Now merge the remaining strings of B with 
the last  F↓{n-2}  of A, placing the results on C.  Continue analogously.

Problem: Which algorithm is faster?

Assumption: Reading and writing run at the same speed.  All reading is done
backwards.  Rewinding is not necessary.  Sorted strings may be increasing or
decreasing, depending on how many times their order will be reversed later in the
algorithm.  It is not necessary to specify the algorithms in detail, nor to do
a detailed algorithmic analysis; just determine which is faster.

For fixed  i,  let  t(i)  be the number of passes used by cocktail-shaker sort
to reach a configuration in which  K↓a≤K↓b  for all  a≤i<b.  What is the
distribution of  t(i)  averaging over all input permutations?  Estimate the
mean and the variance.
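
A simulation sketch for estimating the mean of $t(i)$ (Python; here each directional
sweep is counted as one pass, which is only one of several reasonable conventions, and
the parameters are arbitrary):

import random

def passes_until_split(perm, i):
    # Count cocktail-shaker sweeps (alternating direction) until every key in
    # positions 1..i is <= every key in positions i+1..n.
    a = list(perm)
    n = len(a)
    passes = 0
    while max(a[:i]) > min(a[i:]):
        rng = range(n - 1) if passes % 2 == 0 else range(n - 2, -1, -1)
        for j in rng:
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
        passes += 1
    return passes

n, i, trials = 60, 30, 2000
sample = [passes_until_split(random.sample(range(n), n), i) for _ in range(trials)]
print(sum(sample) / trials)              # estimate of the mean of t(i)
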
7. (10 points)
An algorithm working on  n  data splits them up into two subsets, the first of
size  i  and the second of size  n-i, with probability  d(i).  It then processes
the two subsets, where the cost of processing a subset of size  s  is
as↑2 + bs + c.  Relate the expected total cost of processing the two subsets to
the mean and variance of  d.
\rm
\def\today{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\day, \number\year}

\line{\sevenrm convol.tex[v1,rfn] \today\hfill}

\noindent 
R. Floyd

\noindent 
CS162

\vskip .5in
\noindent
{\bf Distributions, Means, and Variances} [Feller, pp. 227-229]

\vskip .25in

If $d$ is a {\sl probability distribution} defined on the non-negative integers,
then $\sum ↑{∞}↓{i=0}d(i)=1$ and each $d(i)≥0$.

\noindent
The {\sl moments} $M↓k(d)$ are  $\sum ↑∞↓{i=0}i↑kd(i),$ so $M↓0(d)=1$.

\noindent
The {\sl mean} is $M(d)=M↓1(d)$.
\vskip 5pt
\noindent
The {\sl variance} is $V(d)=M↓2(d)-M↓1↑2(d)$.  It is the average square of the
difference between an integer drawn from $d$, and $M(d)$.
\vskip 5pt
\noindent
The {\sl standard deviation \/ }is $\sigma (d)=\sqrt{V(d)}$.
\vskip 5pt

The {\sl convolution} $d↓1\oplus d↓2$ of two distributions is the distribution of
$x+y$, where $x$ and $y$ are drawn independently from $d↓1$ and $d↓2$.

\noindent
If $d=d↓1\oplus d↓2$, 

$d(i)=\sum↓{j+k=i}d↓1(j) d↓2(k)$

$M(d)=M(d↓1)+M(d↓2)$

$V(d)=V(d↓1)+V(d↓2)$

$\sigma (d)=\sqrt{\sigma (d↓1)↑2+\sigma (d↓2)↑2}$

\noindent
For repeated convolutions, we use $d\circ k$ to mean $d\oplus d\oplus
\cdots\oplus d$ ($k$ times).

\noindent
$M(d\circ k)=kM(d)$; $V(d\circ k)=kV(d)$; $\sigma (d\circ k)=\sqrt {k} \sigma (d)$.

If $d↓1$ and $d↓2$ are distributions, and if $x$ is drawn with probability
$p↓1$ from $d↓1$ and probability $p↓2=1-p↓1$ from $d↓2$, then $x$ has distribution
$d$ where $d(i)=p↓1d↓1(i)+p↓2d↓2(i)$, or $d=p↓1d↓1+p↓2d↓2$;
$M(d)=p↓1M(d↓1)+p↓2M(d↓2)$, and $V(d)=p↓1V(d↓1)+p↓2V(d↓2)+p↓1p↓2(M(d↓1)-
M(d↓2))↑2$.

If $x$ and $y$ are drawn independently from $d↓1$ and $d↓2$ 
respectively, and  $d$ is the
distribution for $ax+by$, then 

\vskip 5pt

$d(i)=\sum↓{aj+bk=i}d↓1(j)d↓2(k)$,
\vskip 5pt

$M(d)=aM(d↓1)+bM(d↓2)$, and
\vskip 5pt

$V(d)=a↑2V(d↓1)+b↑2V(d↓2)$.

\noindent
In particular, with $a=1$, $b=-1$, the variance of $x-y$, both drawn from $d$, is
$2V(d)$, so $V(d)={1\over 2}\sum↓{i,j}d(i)d(j)(i-j)↑2$.
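
The convolution rules above are easy to check numerically; a small sketch (Python,
with two arbitrary example distributions):

def mean(d):
    return sum(i * p for i, p in enumerate(d))

def var(d):
    m = mean(d)
    return sum(p * (i - m) ** 2 for i, p in enumerate(d))

def convolve(d1, d2):
    # Distribution of x + y for independent x drawn from d1 and y from d2
    # (a distribution here is a list of probabilities indexed by value).
    d = [0.0] * (len(d1) + len(d2) - 1)
    for j, p in enumerate(d1):
        for k, q in enumerate(d2):
            d[j + k] += p * q
    return d

d1 = [0.2, 0.5, 0.3]                  # a distribution on {0, 1, 2}
d2 = [0.1, 0.0, 0.4, 0.5]             # a distribution on {0, 1, 2, 3}
d = convolve(d1, d2)
print(mean(d), mean(d1) + mean(d2))   # equal: M(d) = M(d1) + M(d2)
print(var(d), var(d1) + var(d2))      # equal: V(d) = V(d1) + V(d2)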
\vskip .5in

\noindent
{\bf Factorial powers} (Knuth vol. 1, p. 609)
\vskip 5pt

$x↑{\overline k}=x(x+1)\cdots(x+k-1)\quad (x↑{\overline 0}=1)$
\vskip 5pt

$\sum↑n↓{x=0(or\; 1)}x↑{\overline k}={n↑{\overline {k+1}}\over{k+1}}$, analogous to
$\int↑n↓0x↑kdx={n↑{k+1}\over{k+1}}$.
\vskip 5pt

\noindent
To sum any polynomial, express it as a linear combination of factorial powers,
sum them, and expand the result.  The Stirling numbers (Knuth vol. 1, pp. 65--68) are
useful to shorten the labor.

{\sl Example}.  Let $d$ be the discrete uniform distribution on 
$(1,n)$, $d(i)={1\over n}$.
Find the usual statistics.
\vskip 5pt

\noindent
$M↓1(d)={1\over n}\sum↑n↓{i=1}i={1\over n}\sum↑n↓{i=1}i↑{\overline 1}=
{1\over n}{n↑{\overline 2}\over 2}={n+1\over 2}$
\vskip 5pt

\noindent
$M↓2(d)={1\over n}\sum↑n↓{i=1}i↑2={1\over n}
\sum↑n↓{i=1}(i↑{\overline 2}-i↑{\overline 1})=
{1\over n}({n↑{\overline 3}\over 3}-{n↑{\overline 2}\over 2})=(n+1)
({n+2\over 3}-{1\over 2})={(n+1)(2n+1)\over 6}$
\vskip 5pt

\noindent
$V(d)=M↓2-M↓1↑2={n↑2-1\over 12}$.

\noindent
$\sigma (d)\approx .288n$.
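
A quick numerical check of this example (Python, $n=10$):

n = 10
d = [1.0 / n] * n                                 # uniform on {1, ..., n}
M1 = sum((i + 1) * p for i, p in enumerate(d))    # values are i+1, since lists start at 0
M2 = sum((i + 1) ** 2 * p for i, p in enumerate(d))
print(M1, (n + 1) / 2)                            # both 5.5
print(M2 - M1 ** 2, (n * n - 1) / 12)             # both 8.25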

\vfill\eject \end
\input basic
\input smacros
\magnify{1200}
\item{0.} Introduction
\item{1.} Probable Complexity--Principle and Practice
\item{2.} The Scope of Problems
\item{3.} Biases and Shortcomings
\item{4.} Probability Spaces for Random Graphs
\item{5.} The Evolution of Random Graphs
\item{6.} Counting Arguments versus Algorithmic Constructions
\item{7.} Stochastic Processes Point of View
\item{8.} Rules to Maintain Randomness and Facilitate Analysis
\item{9.} Examples from Rotate-Extend Algorithms
\item{10.} Examples from Coloring Algorithms and Related Constructions
\item{11.} References
\par\vfill\eject

\section{0. Introduction}

This is not a survey article.  It evolved from seminar talks, and tries to present a
point of view on algorithm design for a class of combinatorial problems, with
almost-sure performance analysis in mind.  We give a basic framework, discuss
several examples drawn mainly from random-graph problems and algorithms, and put
the emphasis on the methodological aspects.  The style is informal: processes are
described and some proofs are indicated, mostly in plain English, without carrying
them out in full.  Details can be found in the references.

\section{1. Probable Complexity---Principle and Practice}

Usually algorithms are meant to be applied often, to many problem instances, so
knowledge about the performance distribution is useful.  By performance we mean
(i) resource utilization (time, space, area, etc.); (ii) quality of the output,
e.g. if the algorithm may fail, how often?  If it constructs an approximate
solution to an optimization problem, how good is the approximation?

The essence of a complexity theory is to analyze the performance, get realistic
estimates (bounds) and design algorithms approaching the best performance.

Minimizing worst-case performance (especially number of steps) has been the most
common approach---for various reasons.  Clearly it stands on unassailable grounds,
which is not the case when one sticks out his neck and imposes a probability measure
on the collection (space) of the inputs (problem instances) for the purpose of
averaging or calculating probabilities of such and such performances.  Our goal 
here is to exhibit the potential richness and usefulness of probable complexity,
provided it is done with care.  But let us mention first two celebrated cases of
basic computing processes.

{\sl Sorting}: Quicksort is popular because it is very fast, taking expected time
$cN\log N$ for some small $c$ [23], given $N$ randomly permuted keys.  Yet its
worst case is of order $N↑2$.

{\sl Linear Programming (LP)}.  Khachiyan's ellipsoid algorithm created a
sensation, since it provably works in polynomial time in terms of input size.
But extensive experiments and some analysis tend to show that its probable performance
is worse than that of the Simplex method, which in turn has an exponential worst case.
Recently Smale [31] has proved that the Simplex method has an expected linear
number of iterations, when a (spherical Lebesgue) measure is used for the
input space.

\section{2. The Scope of Problems}

It is hard to delineate the border between Numerical Analysis and Combinatorial
computing, but the distinction is important in many ways.  For combinatorial
problems the underlying probability spaces will be discrete and even finite (even
though one studies asymptotic behavior).  Roughly, most problems discussed in
Garey and Johnson's book [10] have a dominant combinatorial flavour, even though
they may have numerical and algebraic components.  Linear Programming sits on the
border: several Algebra texts treat the Simplex method along with Gauss
elimination.  However, recent works [17] tend to show that with few variables and
many constraints, the combinatorial geometry flavor of LP takes over.  Also,
important subfamilies such as network flow problems are solved by combinatorial
algorithms.

Our examples here are drawn primarily from Graph Theory but the scope includes
partition-like problems, optimization problems where weights or cardinalities are
involved, and network problems.  Typically one searches for some structure or
property where exhaustive search is expensive.  We consider sequential algorithms
as well as parallel and distributed ones, and allow randomized computation steps.

Randomized algorithms [19] proved to be very efficient for various problems such
as primality, factoring of polynomials, checking identities, synchronization,
string matching and other combinatorial problems.  One usually studies
events in the probability space of all runs of the randomized algorithm on each
single input.  Our emphasis here is on the study of events in the input space,
but whenever random steps are used, the components of the space of runs are added
to the probability space.

REMARK: We argue below that it is useful to think of the ``read input'' instruction
as a randomized step.  But here we are not free to determine the distribution, if 
we want to have adequate measures close to ``real life'' distributions.  This
issue is discussed next.

\section{3. Biases and Shortcomings}

An adequate probability measure for the input space of a problem may be hard to
specify.
\par\noindent
{\sl Example}: Optimal scheduling of jobs on several processors, when a precedence
relation between jobs is given.  The essence of difficulty of such scheduling in
real life is the bottlenecks, which tend to disappear if a random (non-elaborated)
precedence structure is taken.
\par\noindent
{\sl Example}: Efficient layout of graphs on planar (multilayer) grids---random
graphs fail to have bisections with a small interface.

Sometimes what happens is that considering ``most cases'' trivializes the problem---
the crux of the problem is in the worst cases which are hard to specify: E.g.
graph isomorphism, or coloring planar graphs by four colors.

An effort to get an adequate (even elaborate) probability measure for an input
space of a problem is usually worth taking.  Yet one can be accused of biasing.
This is a serious challenge in any applied probabilistic study, which one must be
ready to face.  Examples abound when the accusation is justified.  Of particular
relevance to our study is the following:
\par\vskip .125in\par
All distinct graphs on $n$ labelled vertices are equiprobable.
\par\vskip .125in\par
This measure introduces a strong bias in favor of dense graphs with $O(n↑2)$ edges;
the sparse graphs with $o(n↑2)$ edges have vanishing probability---but they occur
much more often in practice.  We return to this issue below (section 10).
\par\vfill\eject
\par
The best remedy for a strong bias is to subdivide the input space into many small
spaces, i.e. introduce scales of spaces depending on several parameters {\sl and} to
diversify the mode of randomization.  Thereby much more is achieved:

\ina{(i) parameters which significantly influence the performance are revealed;}
\ina{(ii) The prediction about individual inputs becomes more precise, because}
\inb{the space and the internal variability are smaller;}
\ina{(iii) Correlation between various functions and parameters emerges.}

Indeed if one considers general graph properties and functions (performance of
algorithms is clearly included) then the logical remedy proposed was in fact
employed in a very fruitful way in the study of the Evolution of Random Graphs,
a topic shaped primarily by a series of papers of Erd\"os and R\'enyi [5,6,7]
around 1960.  A short overview of this theory is given after the probability
spaces are described.
\section{4. Probability Spaces for Random Graphs}

For graphs with $n$ labelled vertices, the next most important parameter is the
edge density.  We either fix the number of edges $L$:
\par\vskip .125in\par
$G(n,L)= \{\hbox{graphs with}\,L\,\hbox{edges; instances equiprobable}\}$
\par\vskip .125in\par
or we fix an edge probability $p$, i.e. edges occur as independent trials with
success probability $p$:
\par\vskip .125in\par
$G(n,p)= \{\hbox{all graphs; instance with}\,L\,\hbox{edges has probability}\,
p↑L(1-p)↑{{n\choose 2}-L}\}$.
\par\vskip .125in\par
The relation between the parameters is $L\approx{n\choose 2}p$, but one has to pay a small
``conversion fee'' upon translating results from one scale to the other.
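
A sketch of sampling from the two spaces (Python; the values of $n$ and $p$ are
arbitrary), illustrating the relation $L\approx{n\choose 2}p$:

import itertools, random

def sample_gnp(n, p):
    # G(n, p): each of the C(n, 2) possible edges occurs independently with probability p.
    return [e for e in itertools.combinations(range(n), 2) if random.random() < p]

def sample_gnl(n, L):
    # G(n, L): choose exactly L of the possible edges, all such graphs equiprobable.
    return random.sample(list(itertools.combinations(range(n), 2)), L)

n, p = 1000, 0.01
edges = sample_gnp(n, p)
print(len(edges), p * n * (n - 1) / 2)   # the edge count concentrates near (n choose 2) p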

One uses similar ways to define spaces of random digraphs, bipartite graphs,
hypergraphs [22], graphs on geometric grids, or networks where special subsets of
edge occurrences are subject to random choice (other edges are prohibited or else
represent fixed connections).

Another useful mode of randomization (recently employed)[27,22,34] is based on 
vertex choice.  For digraphs set
\par\vskip .125in\par
$D(\lscr↓1,\ldots,\lscr↓n)= 
\{\hbox{vertex}\,i\,\hbox{randomly chooses}\,\lscr↓i\,\hbox{neighbors}\}$,
\par\vskip .125in
\par\noindent
denoted by $D(n,\lscr)$ when all $\lscr↓i$ are equal.  To get the probability
for an undirected graph $G$, just sum up over all distinct directed versions of $G$
(hence the notation $UD(n,\lscr)$).  This is a very useful ``trick'' which
physicists use so often. The graphs we observe split into (equiprobable) multiplets
of configurations which are digraphs in this case---and which are easier
to manipulate in probability analysis.
\par\vfill\eject
We wish to study the asymptotic behavior of graph properties and algorithmic
performance, as $n→∞$.  To single out sparse graphs we let $p=p(n)$ and
$L=L(n)$ vary with $n$; we might have $L(n)=c\cdot n\cdot\log n$ 
or $L(n)=d\cdot n↑α$.

An event $E$ in one of those spaces holds almost surely (AS) if
\par\vskip .125in\par
\inb{Prob $(GεE)=$ Prob $(E)=1-o(1)$ as $n→∞$;}

\inb{sometimes more precisely $=1-O(n↑{-α})$, $α>0$;}

\inb{or if we are lucky $=1-O(e↑{-βn})$, $β>0$.}

\section{5. The Evolution of Random Graphs}

A convenient and ``neutral'' parameter to express the edge-density is the
parameter $d(n)$:
\par\noindent
$d(n)=\hbox{Expected degree of}\,GεG(n,p)=np\approx
\hbox{average degree of}\,GεG(n,L)={2L\over n}$

One studies random graphs as $d(n)$ grows gradually.  The striking phenomenon is:

$\bullet$ Many important properties have a sharp threshold.

$\bullet$ Many interesting graph functions are sharply determined.

An example is the connectivity threshold at $d(n)=\log n$: If
\par\vskip .125in\par
$d(n)-\log n→∞$ then Prob $\{$G is connected$\}=1-o(1)$,
\par\vskip .125in\par
and if $d(n)-\log n→-∞$ then Prob $\{$G is connected$\}=o(1)$.
\par\vskip .125in\par
The difference $d(n)-\log n$ may be, say, $r\log\log n$---then for $r>0$ the
function {\it min}$deg(G)$ is sharply determined, $r≤${\it min}$deg(G)≤r+1$ [7].
Thus a threshold for a property Q means that Prob (Q) rises very steeply from
$o(1)$ to $1-o(1)$ over a narrow band of $d$-values.

{\sl Fact 1}: $d(n)=1$ is a dramatic threshold for the random graph structure.
Below it, the random graph is AS composed of small components of size $O(\log n)$,
each of which is a tree or a tree with one cycle.  For $d(n)=1+λ$, $λ>0$ the
graph has one giant connected component of size $\approx c(λ)\cdot n$, $c(λ)$ rises
quite steeply with $λ$, so that the ``giant component'' swallows the rest of
the graph (which looks as before).
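
A small experiment around the threshold $d(n)=1$ (Python, union-find on $G(n,p=d/n)$;
the sizes are illustrative only):

import random

def largest_component(n, d):
    # Sample G(n, p = d/n) and return the size of its largest connected component,
    # using union-find with path halving.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    p = d / n
    for u in range(n):
        for v in range(u + 1, n):
            if random.random() < p:
                parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        r = find(x)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())

for d in (0.5, 1.0, 1.5, 2.0):
    print(d, largest_component(2000, d))   # jumps from O(log n) to a constant fraction of n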

{\sl Fact 2}: $d(n)=\log n$ (or $\log n +r\log\log n$) is a threshold for many
properties related to connectivity: existence of 1-factor, r-connectivity, having
Hamiltonian paths.  For $d(n)=\alpha\log n$, $\alpha <1$, the giant connected
component, the maximal matching, and the longest path are all of size $n(1-o(1))$
(the $o(1)$ determined by $\alpha$); the vertices outside the giant component are
isolated or form very small trees.

{\sl Fact 3}: For $d(n)=\log n\cdot\omega(n)$, $\omega(n)→∞$ the random graph is
almost regular in the sense that all vertex-degrees $d↓i=d(n)\pm o(n)$, it
actually contains a regular subgraph of that size [28], and has ``regular''
behavior in many other ways.  For $d(n)=O(n)$ everything interesting about the
random graph is pretty much determined and the arguments are usually simpler.

{\sl Fact 4}: We list some functions which are sharply determined: the maximal
density ratio $e/v$ of patterns (subgraphs with $v$ vertices, $e$ edges) which
occur in G; the sizes of the largest clique, independent set, vertex cover, and
dominating set; the chromatic number.  Some of these are discussed below.
\par\vfill\eject

\section{6. Counting Arguments versus Algorithmic Constructions}

In many cases thresholds are obtained by counting estimates.  For random graphs
(or other structures) one estimates:
\par\vskip .125in\par\noindent
(1) 1st moment: In how many ways does the property Q hold?
\par\vskip .125in\par\noindent
(2) 2nd moment: In how many pairs of ways does Q hold?
\par\vskip .125in\par\noindent
If the answer to (1) is $o(1)$, then clearly we are below the threshold for Q.  If
(1) is large and (2) is not too large compared to (1), then by Chebyshev's
inequality one shows we are above the threshold for Q (this is the 2nd moment
method).

The (1st moment) counting arguments usually play the role of weak
``information-theoretic'' lower bounds in Complexity Theory.  Usually they should not be hard to
derive.  When a 2nd moment or another argument works, it gives a non-constructive
upper-bound---thus demonstrating a narrow threshold.  This is a feasibility proof,
and now the challenge is to design an efficient algorithm within the feasibility
range or parts of it, perhaps under some relaxations.  Notice that the property
Q may be NP-hard to determine, but we are after almost-sure efficient
performance and there is no contradiction with $P≠NP$.  Rather, we strive to show
that even though the problem is hard in worst cases, an efficient algorithm will
crack it in most cases within a suitable (and adequate) probability space for
the input.

{\sl Example--an even partition of 2k numbers}: Draw 2k numbers $x↓i$ independently
from a given distribution satisfying some general assumptions.  Using the 2nd
moment method one shows [16] there is a partition
$$|A|=|B|=k,\,|{\sum↓{iεA}}x↓i-{\sum↓{iεB}}x↓i|≤{{k↑2}\over{4↑k}}$$
with probability $1-O(1/k)$.

Recently a linear time algorithm for constructing A and B was found [14], giving
for some $\alpha >0$
$$|{\sum↓{iεA}}x↓i-{\sum↓{iεB}}x↓i|≤2↑{-\alpha\log↑2k},$$
with probability $1-O(1/k)$.

{\sl Example: Chromatic number $\chi(G)$, independent sets, dominating sets}.
In $G(n,d(n))$, $d(n)=$ average degree $→∞$, we have
$\chi(G)≥{d(n)\over 2\log d(n)}$
AS [30].  This lower bound is based on showing, by the 1st-moment method, that
the largest independent set is $≤{2n\cdot\log d(n)\over d(n)}$.  On the other hand
an efficient algorithm (sequential or distributed) was found [30] which colors 
G with A(G) colors where
$$\chi(G)≤A(G)≤{d(n)\over \log d(n)}(1+o(1))\,\hbox{AS}.$$
The upper bound is derived from a lower bound for a random dominating set being
$≥{n\cdot\log d(n)\over d(n)}(1-o(1))$ AS.  This is discussed further below.
\par\vfill\eject
\par
It may happen (as a pleasant surprise) that the feasibility proof determining the
threshold is achieved by an efficient algorithm (and the proof is easier, or the
only one known).  This happens in the case of graph-factors [26] and Hamiltonian
paths [25].  The reason for this situation is that the 2nd moment method fails, and
the mathematics about conditions for the property is meagre or complicated.  The
best bet for proving the property (say that a certain substructure exists) is to design
an efficient process to construct the substructure and prove that it does the
job AS.  The efficiency may help in the proof (e.g. bounding rotation trees
in [25]).


\section{7. Stochastic Processes Point of View}

An algorithm is a process which evolves in a certain configuration space.  A clear
picture of the process and good control of the design are essential for
extracting information about performance.  The typical difficulty in the
probabilistic study of algorithms is that randomness tends to be distorted by
algorithmic steps.  Simply constructed input distributions get messed up, and one
cannot estimate output distributions.  We tackle this difficulty in the next
section.

On the other hand, the evolution of stochastic processes is a topic extensively
treated in mathematical and statistical works.  Branching processes, birth-death
processes, Markov processes, queuing models and service policies---these are the
kinds of paradigms which are handy and useful for the probabilistic study of
algorithms, including parallel and distributed ones.

{\sl A Basic Rule}: Having a (draft of a) proposed algorithm to solve a problem on
a random input space, determine the nature of the stochastic processes
going on in it.  Try to simplify, relax or majorize by auxiliary processes which
are analyzable (perhaps even in ``text-book'' form).
This, like many other basic rules, seems self-evident, but it really helps to
organize and concentrate one's effort toward a fruitful conclusion.  In the next
section we put down more specific rules.

\section{8. Rules to Maintain Randomness and Facilitate Analysis}

These rules were extracted from experience and examples (some described below).
\par\noindent
R1 {\sl Intertwined evolution of the algorithm and its input space}: Design (or
perceive) the process so that the random input space is generated ``on the fly,''
as a result of a sequence of random choice (``read next input'') steps, each such
step being delayed to the last moment, when it is requested by the algorithm
evolution.  The resulting process (and its analysis) looks very much like a
randomized algorithm; events and their probabilities are easier to estimate.
In analyzing a stable marriage algorithm, Knuth [15] demonstrates the advantages
of this rule, picturing the ``read input'' step as disclosing the values of
face-down cards.
\par\vfill\eject
\par\noindent
R2 {\sl Convenient presentations of the input space}: To accomplish R1, but also for
other purposes, it is very useful to try various presentations of the same input
space.  E.g. when a procedure reads input from adjacency lists of a graph, a
random digraph is preferable, because elements in distinct [the same] adjacency
lists are [almost] independent.  There are simple ways to present spaces of
undirected graphs via directed ones; we have mentioned such a ``trick'' above.
\par\noindent
R3 {\sl Try a Nobleman's approach---do not recycle}. Use each read-in random input
only once actively in a probability estimate.  Thereafter it can be used
``passively'' in the structure, but do not recycle it to enhance probabilities.
Noblemen can afford such a waste, which makes life easier, as it facilitates
independent arguments.  Recycling in an algorithm makes estimates harder, but
may be rewarded by a better quality output [25].
\par\noindent
R4 {\sl Design processes with an on-line character}.  Ideally this means that at
several points of the process evolution, part of the output is determined by the
input read up to that point, and moreover the rest of the process is a
recurrence of the same or similar procedure on the rest of the input.
\par\noindent
R5 {\sl Careful use of randomized computation steps}. In a broader context, this
is a whole strategy.  For our restricted context, we point out their role in
avoiding congestion, long queues (in network packet switching [32,33]) and
avoiding deadlocks.  Also, when several actions are possible, especially in
distributed algorithms, a proper randomization preserves the ease of analysis.
\par\noindent
R6 {\sl Multiple randomization in the output}. This is a useful technique which
implements ideas in R1--R4 on a macro level. The construction of the combinatorial
object is divided into stages, in a modular fashion.  Correspondingly, one puts
the input space in a direct sum form; each summand drives the process through a given
stage, thus using only the outcome, but not the inputs, of previous stages.
As in R3, we waste information but facilitate analysis.  However, if one
allocates to the stages precisely the amount of input needed to drive them
through, it is still possible to get tight results and good quality output,
despite the waste [25,29].

In the following sections we illustrate by some examples how these rules were
put to use, leading to a simplified analysis and better performance.

\section{9. Examples from Rotate-Extend Algorithms}
A standard way to search for a combinatorial structure---say a placement of $n$
queens on an $n\times n$ chessboard---is to work through a sequence of substructures
$\{ P↓i\}↑n↓{i=0}$ where $P↓{i+1}$ extends (or augments) $P↓i$ by one more
placement.  When no extension of $P↓i$ is possible we backtrack to $P↓{i-1}$ and
extend back to a $P↑'↓i$ hoping $P↑'↓i$ does extend.  Some problems enable a
{\sl rotation} of $P↓i$ to $P↑'↓i$ without a backtrack to $P↓{i-1}$.
\par\vfill\eject
\par\noindent
{\sl Example: Matchings in bipartite graphs}.  Here $P↓k$ consists of $k$ 
disjoint edges $\{ x↓iy↓i,\ 1≤i≤k\}$ and a free vertex $x$.  If we pick an edge
$xy$ and $y≠y↓i$, $1≤i≤k$, then we have an extension.  Else let $y=y↓j$; the
rotation is given by
$$P↑'↓k=P↓k-\{ x↓jy↓j\}\union \{xy↓j\}.$$
\par\noindent
In $P↑'↓k$ the vertex $x↓j$ is now free (unmatched).  The essence of Extend-Rotate
(ER) methods is to compose rotations until an extension is achieved
(or becomes highly probable).  In the context of matchings, factors and network
flow problems, a sequence of rotations gives rise to an ``augmenting path''
on which most algorithms are based.
\par\noindent
{\sl Example: long paths in graphs}. Here $P↓k$ is a path with $k+1$ vertices;
we keep an end $x↓0$ anchored, and look for an extension of the other end $x$.
An edge $xy$ with $yεP↓k$ {\sl rotates} as depicted in fig. 1:
\par\vskip 2in
\ctrline{Fig. 1: $xy$ rotates, $yz$ deleted}
The new order in $P↓k↑'$ is from $x↓0$ to $y$ to $x$ to $z$, and $z$ is the new
end.
\par\vskip .125in\par
There are several ways to incorporate Extend-Rotate in algorithms designed for
the goal of a complete (or maximal, or prescribed size) structure
(e.g. a matching, or a simple path).  The basic idea in a
probable analysis of such algorithms is to estimate the distribution of the
number of rotations per extension. Some authors carried through complicated
analyses [2,20].  We shall sketch how the rules we set down lead to a simplified
analysis and even better performance.

The ER is done from an end vertex $x$, so let the design be oriented to read input
from adjacency lists, in one direction only (read and erase) if we adopt R3
(no recycling). We prefer the presentation of random {\sl digraphs} (R2) which are
generated on the fly as random elements from the list of $x$ are read in.

What processes go on if we design along these lines? There is a race between two
processes. One is the increase in $k$, the number of points on the partial path
(structure) $P↓k$, which occurs each time a new vertex $y\mathrel{\not ε}P↓k$ 
is read in (then we extend). The statistical paradigm is the Coupon-Collector model
[9], which analyzes how many random choices are needed to increase a collection
from $k$ to $k+1$ (and globally from $0$ to $n$).
\par\vfill\eject
The other process in the race is the depletion of the lists, driving to a
(premature) failure.  Indeed when the read-in choice is first made from the
emptied list of $x$ and we do not allow a backtrack, we have failed unless the current
structure $P↓k$ (say a path) is already our goal.
The equations (or inequalities) governing this depletion (``death'') process are
not hard.  The initial distribution for lists' lengths is derived from the input
random graph space, say $D↓{n,p}$ (directed graphs, edge probability $p$). We
determine the value of $p$ ($p>5{\log n\over n}$) for which the collection process
wins AS; hence a complete Hamiltonian path is constructed (efficiently) before
a list is emptied.

A design of a quick distributed algorithm for a perfect matching [29] illustrates
nicely the use of multiple randomization.  It consists of a succession of
(identically coded) stages (called phases). In each stage the number of unmatched
vertices decreases by a constant factor (hence a geometrically decreasing sequence, and
$C\cdot\log n$ stages are needed). In driving through a stage, each vertex makes
two new choices of neighbors (altogether $2C\log n$); these are used exclusively
in the growth analysis of rotation trees which are constructed (in parallel) at
this stage. A double randomization is also used in [1,25].

A deep probable analysis of a simple efficient algorithm (which does not use
rotations) for a maximum matching in sparse graphs was carried out in [13], showing
that the maximum matching in $G(n,p={λ\over n})$ is sharply determined to be
of size $f(λ)\cdot n$, and that for any $ε>0$ a matching of size $(f(λ)-ε)n$ is achieved by the
algorithm. The analysis using Markov processes is quite involved.

\section{10. Examples from Coloring Algorithms and Related Constructions}

A coloring of a graph by $k$ colors divides the vertex set into $k$ independent
sets $I↓j,\ 1≤j≤k$. An AS upper bound on the size of any independent set is easy
to obtain, and it gives an AS lower bound on $k$.

It was shown 10 years ago [4,11,5] that the quick greedy algorithm, 
which colors the vertex sequence $(v↓1,\ldots,\ v↓n)$ by letting $v↓i$
have the minimal color not occupied by neighboring vertices $v↓j,\ j<i$, 
gives a pretty sharp output, not more than $(2+ε)$ times the lower bound on $k$, 
AS in the space $G(n,\ p=\hbox{constant})$.
Since finding maximum cardinality independent set and good
approximation to minimal coloring are both NP-hard [10], the story seemed closed.
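
For concreteness, a sketch of this greedy sequential coloring applied to a sampled
$G(n,p)$ (Python; the parameter values and the comparison printed at the end are
illustrative only):

import math, random

def greedy_coloring(n, p):
    # Sample G(n, p), then give each vertex in turn the smallest color
    # not already used by a previously colored neighbor.
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if random.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    color = {}
    for v in range(n):                       # by symmetry, this order is already "random"
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return max(color.values()) + 1

n, p = 1000, 0.5
print(greedy_coloring(n, p),                 # typically about n / log_2 n colors here
      n / (2 * math.log(n, 1 / (1 - p))))    # the n / (2 log_b n) benchmark, b = 1/(1-p)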

We reopen it, because of the strong bias.  In $G(n,\ p=\hbox{constant})$ all the
sparse graphs with $o(n↑2)$ edges (which are very important in theory and applications)
have vanishing probability.  So any result for $p=\hbox{constant}$ has a limited
value. Indeed, if we try to extrapolate the prediction for the number of colors,
${n\over\log n}\log({1\over 1-p})$, for $p→0$ we get ${np\over\log n}={d(n)\over
\log n}$. For $d(n)=\log n$ we get 1, a very bad prediction.
\par\vfill\eject
It was conjectured in [8,page 96,9(c)] that ${d(n)\over\log d(n)}$ is, up to a
factor of 2, the sharp value for the chromatic number $k$ for $G(n,\ d(n))$,
for any expected degree function $d(n)→∞$.

The key to proving this result by an efficient algorithm was first to reformulate
the greedy algorithm according to our design rule R4. If we start with a random
ordering of vertices, the first chromatic set in the greedy algorithm is a
randomly constructed maximal independent set $I↓1$.  Such a set is necessarily
dominating (i.e. every vertex of the graph is in it or adjacent to it).  The size of a random
dominating set is sharply determined by $d(n)$. By peeling off $I↓1$ we get a
random graph on $m$ vertices ($m$ itself is random), but with unchanged edge
probability, hence a reduced expected degree $d(m)=d(n)\cdot{m\over n}$; we
iterate this on-line process, peeling off chromatic sets from a shrinking random
graph.

Now that we have good control of the process, we can modify it by
{\sl truncation} if it runs for too many stages and produces a bad output, which it
tends to do if $p(n)→0$ ($d(n)$ rises slowly). We truncate it after 
${d(n)\over\log d(n)}(1+ε)$ stages (so the algorithm depends on the space
parameter) and prove that AS the shrinking graph gets below expected degree 1,
where it AS consists of trees and three additional colors suffice.

Why don't we peel off maximum cardinality (not just random maximal) independent
sets? Because they are NP-hard to find {\sl and} (this is related) their
determination involves all edges of the graph; peeling off one will not result in
a random graph---this design does not follow R4.

Coloring is a fundamental graph property, important in applications and hard to
optimize, even approximately.  Even for random graphs there are further aspects
we have not discussed, such as efficient distributed algorithms and performance
improvement, e.g. decreasing the algorithm's failure probability to be exponentially
small.
\par\vfill\eject
\section{11. References}

\item{1.} M. Ajtai, J. Koml\'os, and E. Szemer\'edi, The longest path in a
random graph. {\sl Combinatorica} 1 (1981), 1-12.
\item{2.} D. Angluin and L. Valiant, Fast probabilistic algorithms for Hamiltonian
circuits and matchings. {\sl J. Comp. System Sci. 18} (1979), 155-193.
\item{3.} B. Bollob\'as, {\sl Graph Theory--an Introductory Course}. Springer,
N.Y. 1979.
\item{4.} B. Bollob\'as and P. Erd\"os, Cliques in random graphs. {\sl Proc.
Cambridge Phil. Soc. 80} (1976), 419-437.
\item{5.} P. Erd\"os, {\sl The Art of Counting--Selected Writings}. (J. Spencer,
Ed.) The M.I.T. Press, Cambridge 1973.
\item{6.} P. Erd\"os and A. R\'enyi, The evolution of random graphs. {\sl Publ.
Math. Inst. Hung. Acad. Sci. 5A} (1960), 17-61.
\item{7.} P. Erd\"os and A. R\'enyi, On the strength of connectedness in random
graphs. {\sl Acta Math. Acad. Sci. Hung. 12} (1961), 261-267.
\item{8.} P. Erd\"os and J. Spencer, {\sl Probabilistic Method in Combinatorics}.
Academic Press, N.Y. 1974.
\item{9.} W. Feller, {\sl An Introduction to Probability Theory and its 
Applications}, Wiley N.Y., 1957.
\item{10.} M.R. Garey and D.S. Johnson, {\sl Computers and Intractability},
Freeman and Co., San Francisco, 1979.
\item{11.} G.R. Grimmett and C.J.H. McDiarmid, On coloring random graphs. {\sl Proc.
Cambridge Phil. Soc. 77} (1975), 314-324.
\item{12.} R.M. Karp, The probabilistic analysis of some combinatorial search
algorithms. In {\sl Algorithms and Complexity} (J.F. Traub, Ed.) Academic Press
N.Y., 1976.
\item{13.} R.M. Karp and M. Sipser, Maximum matchings in sparse random graphs,
{\sl 22nd Annual FOCS Symp.} (1981), 364-375.
\item{14.} R.M. Karp et al. Personal communication.
\item{15.} D.E. Knuth, {\sl Mariages Stables}, Les Presses de l'Universit\'e de
Montr\'eal, 1976.
\item{16.} G.S. Lueker, A theorem on sums of subsets of random variables (Draft,
1982).
\item{17.} N. Megiddo, {\sl Solving linear programming in linear time when dimension
is fixed.} (Statistics Dept. Tel-Aviv Univ.), 1982.
\item{18.} L. P\'osa, Hamiltonian circuits in random graphs. {\sl Discrete Math.
14} (1976), 359-364.
\item{19.} M.O. Rabin, Probabilistic Algorithms. In {\sl Algorithms and 
Complexity} (J.F. Traub, Ed.) Academic Press N.Y., 1976.
\item{20.} J.H. Reif and P.G. Spirakis, Random matroids. {\sl 12th Annual ACM
Symp. on Theory of Computing} (1980), 385-397.
\item{21.} J. Schmidt and E. Shamir, An improved program for constructing open hash
tables. {\sl Proc. of 7th ICALP Colloquium}, (1980), 569-580.
\item{22.} J. Schmidt and E. Shamir, A threshold for perfect matchings in random
d-pure hypergraphs, to appear in {\sl Discrete Math.}.
\item{23.} R. Sedgewick, Quicksort. STAN-CS-75-492, Dept. of Comp. Sci. Stanford
Univ., May 1975.
\item{24.} E. Shamir, Stochastic analysis of extension-rotation algorithms
(Hamiltonian tours for the nobility). Submitted to {\sl STOC Symp.}, 1982.
\item{25.} E. Shamir, How many random edges make a graph Hamiltonian? Submitted to
{\sl Israel J. of Math.}
\item{26.} E. Shamir and E. Upfal, On factors in random graphs. {\sl Israel J. of
Math. 39} (1981), 296-302.
\item{27.} E. Shamir and E. Upfal, One factor in random graphs based on vertex
choice. To appear in {\sl Discrete Math.}
\item{28.} E. Shamir and E. Upfal, Large regular factors in random graphs. To
appear in {\sl Annals of Discrete Math.}
\item{29.} E. Shamir and E. Upfal, $N$-processor graphs distributively achieve
perfect matchings in $O(\log↑2N)$ beats. {\sl Proc. of ACM Symp. on Principles
of Distributed Computing}, Ottawa, 1981.
\item{30.} E. Shamir and E. Upfal, Sequential and distributed graph coloring
algorithms with performance analysis in random graph spaces. Submitted to
{\sl J. of Algorithms.}
\item{31.} S. Smale, On the average speed of the simplex method for linear
programming. Manuscript, Berkeley, 1981.
\item{32.} E. Upfal, Efficient schemes for parallel communication. {\sl Loc.
cit.} in [29].
\item{33.} L.G. Valiant, A scheme for fast parallel communication. {\sl SIAM
J. on Computing 11} (1982), 350-361.
\item{34.} D. Walkup, Matchings in random regular bipartite digraphs. {\sl
Discrete Math. 31} (1980), 59-64.
\item{35.} B. Weide, {\sl Statistical methods in algorithm design and analysis}.
Ph.D. Dissertation, Comp. Sci. Dept., Carnegie-Mellon Univ., 1978.
\par\vfill\end